A log-linear discriminative modeling framework for speech recognition
Author
Abstract
Conventional speech recognition systems are based on Gaussian hidden Markov models (HMMs). Discriminative techniques such as log-linear modeling have been investigated in speech recognition only recently. This thesis establishes a log-linear modeling framework in the context of discriminative training criteria, with examples from continuous speech recognition, part-of-speech tagging, and handwriting recognition. The focus is on the theoretical and experimental comparison of different training algorithms. Equivalence relations for Gaussian and log-linear models in speech recognition are derived. It is shown how to incorporate a margin term into conventional discriminative training criteria such as minimum phone error (MPE). This makes it possible to evaluate directly the utility of the margin concept for string recognition. The equivalence relations and the margin-based training criteria lead to a unified view of three major training paradigms, namely Gaussian HMMs, log-linear models, and support vector machines (SVMs). Generalized iterative scaling (GIS) is traditionally used to optimize log-linear models under the maximum mutual information (MMI) criterion. This thesis suggests an extension of GIS to log-linear models that include hidden variables, and to other training criteria (e.g. MPE). Finally, investigations on convex optimization in speech recognition are presented. Experimental results are provided for a variety of tasks, including the European Parliament plenary sessions task and Mandarin broadcasts.
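The equivalence between Gaussian and log-linear models mentioned in the abstract can be illustrated with a small numerical check. The sketch below is not the derivation from the thesis; it uses the standard textbook identity for one-dimensional Gaussian class-conditional densities, whose class posterior is a log-linear model over the features (x^2, x, 1). All function names and parameter values are hypothetical and chosen only for illustration.

```python
import math

def gaussian_posterior(x, priors, means, variances):
    """Class posteriors p(c|x) via Bayes' rule with 1-D Gaussian class-conditionals."""
    joint = [
        p * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
        for p, m, v in zip(priors, means, variances)
    ]
    total = sum(joint)
    return [j / total for j in joint]

def gaussian_to_loglinear(priors, means, variances):
    """Map Gaussian parameters to log-linear weights on the features (x^2, x, 1)."""
    params = []
    for p, m, v in zip(priors, means, variances):
        w2 = -1.0 / (2 * v)  # weight of the second-order feature x^2
        w1 = m / v           # weight of the first-order feature x
        # Bias: log prior plus the x-independent terms of the log density.
        w0 = math.log(p) - 0.5 * math.log(2 * math.pi * v) - m * m / (2 * v)
        params.append((w2, w1, w0))
    return params

def loglinear_posterior(x, params):
    """Class posteriors from a log-linear model: softmax over linear scores."""
    scores = [w2 * x * x + w1 * x + w0 for w2, w1, w0 in params]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative (made-up) parameters for three classes.
priors = [0.5, 0.3, 0.2]
means = [-1.0, 0.5, 2.0]
variances = [1.0, 0.5, 2.0]
params = gaussian_to_loglinear(priors, means, variances)

# Both parameterizations give the same posteriors at every test point.
for x in (-2.0, 0.0, 1.3):
    a = gaussian_posterior(x, priors, means, variances)
    b = loglinear_posterior(x, params)
    assert all(abs(u - v) < 1e-12 for u, v in zip(a, b))
```

The mapping from Gaussian to log-linear parameters is one-directional here. The converse direction (recovering valid Gaussian parameters from arbitrary log-linear weights) needs extra care, for example because the weight on x^2 must be negative to correspond to a positive variance, and is not covered by this sketch.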
Related resources
Discriminative adaptation for log-linear acoustic models
Log-linear models have recently been used in acoustic modeling for speech recognition systems. This has been motivated by competitive results compared to systems based on Gaussian models, and a more direct parametrisation of the posterior model. To competitively use log-linear models for speech recognition, important methods, such as speaker adaptation, have to be reformulated in a log-linear f...
Discriminative Learning of Feature Functions of Generative Type in Speech Translation
The speech translation (ST) problem can be formulated as a log-linear model with multiple features that capture different levels of dependency between the input voice observation and the output translations. However, while the log-linear model itself is of discriminative nature, many of the feature functions are derived from generative models, which are usually estimated by conventional maxim...
Simultaneous Discriminative Training and Mixture Splitting of HMMs for Speech Recognition
A method is proposed to incorporate mixture density splitting into the acoustic model discriminative training for speech recognition. The standard method is to obtain a high resolution acoustic model by maximum likelihood training and density splitting, and then to improve this model by discriminative training. We choose a log-linear form of acoustic model because for a single Gaussian density p...
Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription
This paper describes a new method for semi-supervised discriminative language modeling, which is designed to improve the robustness of a discriminative language model (LM) obtained from manually transcribed (labeled) data. The discriminative LM is implemented as a log-linear model, which employs a set of linguistic features derived from word or phoneme sequences. The proposed semi-supervised di...
A Decade of Discriminative Language Modeling for Automatic Speech Recognition
This paper summarizes the research on discriminative language modeling focusing on its application to automatic speech recognition (ASR). A discriminative language model (DLM) is typically a linear or log-linear model consisting of a weight vector associated with a feature vector representation of a sentence. This flexible representation can include linguistically and statistically motivated fe...
Publication date: 2010